Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning
Exploration in multi-agent reinforcement learning is a challenging problem, especially in environments with sparse rewards. We propose a general method for efficient exploration by sharing experience amongst agents. Our proposed algorithm, called Shared Experience Actor-Critic (SEAC), applies experience sharing in an actor-critic framework by combining the gradients of different agents. We evaluate SEAC in a collection of sparse-reward multi-agent environments and find that it consistently outperforms several baselines and state-of-the-art algorithms by learning in fewer steps and converging to higher returns. In some harder environments, experience sharing makes the difference between learning to solve the task and not learning at all.
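The abstract's idea of "combining the gradients of different agents" can be illustrated with a small sketch: each agent's actor loss includes its own on-policy term plus the other agents' transitions, reweighted by an importance ratio since those actions came from a different policy. This is a minimal, hedged illustration with made-up function and parameter names (e.g. `lam` for the sharing weight), not the authors' implementation.

```python
import math

def seac_actor_loss(logp_i_own, adv_own, shared, lam=1.0):
    """Illustrative SEAC-style actor loss for agent i (scalars for clarity).

    logp_i_own: log pi_i(a|o) for agent i's own transition
    adv_own:    advantage estimate for that transition
    shared:     list of (logp_i, logp_k, adv_k) for transitions collected
                by other agents k, evaluated under both pi_i and pi_k
    lam:        weight on the shared-experience term (assumed hyperparameter)
    """
    # Standard on-policy policy-gradient term for agent i's own experience.
    loss = -logp_i_own * adv_own
    for logp_i, logp_k, adv_k in shared:
        # Importance weight pi_i(a|o) / pi_k(a|o) corrects for the fact
        # that agent k's data is off-policy for agent i. In practice this
        # ratio would be treated as a constant (detached from the gradient).
        iw = math.exp(logp_i - logp_k)
        loss += -lam * iw * logp_i * adv_k
    return loss
```

With identical policies the importance weight is 1 and the shared term reduces to an ordinary policy-gradient term on the other agent's data, which is the intuition behind sharing experience among homogeneous agents.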
Review for NeurIPS paper: Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning
Additional Feedback: I like that the authors ran their experiments from various perspectives, but experience sharing occasionally appears in the existing literature. For example, although it was not mentioned in the paper, [1] used experience sharing among agents in their implementation, and I believe there may be other works on the topic of "MARL for homogeneous agents". The main reason I score "below acceptance" is that quite weak baselines seem to be used:
- In Table 1, QMIX and MADDPG substantially underperform SEAC and the other baselines (IAC, SNAC). However, since CTDE methods are generally more stable than independent learning methods, I think this part should be explained in more detail.
Although other reviewers have argued for the strength of this work based on the importance weighting and the simplicity of the method, I still think there should have been stronger baselines.
Meta-Review for NeurIPS paper: Shared Experience Actor-Critic for Multi-Agent Reinforcement Learning
This paper introduces a simple idea for MARL: using importance weights to correct for off-policy experience. Generally, the reviewers agree that the paper is clear and well written. Although the main idea is very natural and intuitive, as pointed out by reviewer 4, it is not obvious that it would actually work. Therefore, one of the strengths of this paper is to show that intuition fails us in this case. The reviewers point out some weaknesses in the empirical section, in particular the comparisons with other methods, and we hope that the authors will be able to address some of these in the final version of the paper.